Methodical Evaluation of Arabic Word Embeddings
نویسندگان
چکیده
Many unsupervised learning techniques have been proposed to obtain meaningful representations of words from text. In this study, we evaluate these various techniques when used to generate Arabic word embeddings. We first build a benchmark for the Arabic language that can be utilized to perform intrinsic evaluation of different word embeddings. We then perform additional extrinsic evaluations of the embeddings based on two NLP tasks.
منابع مشابه
A Comparison of Word Embeddings for the Biomedical Natural Language Processing
Background Neural word embeddings have been widely used in biomedical Natural Language Processing (NLP) applications as they provide vector representations of words capturing the semantic properties of words and the linguistic relationship between words. Many biomedical applications use different textual resources (e.g., Wikipedia and biomedical articles) to train word embeddings and apply thes...
متن کاملWord Embeddings and Convolutional Neural Network for Arabic Sentiment Classification
With the development and the advancement of social networks, forums, blogs and online sales, a growing number of Arabs are expressing their opinions on the web. In this paper, a scheme of Arabic sentiment classification, which evaluates and detects the sentiment polarity from Arabic reviews and Arabic social media, is studied. We investigated in several architectures to build a quality neural w...
متن کاملWord-Alignment-Based Segment-Level Machine Translation Evaluation using Word Embeddings
One of the most important problems in machine translation (MT) evaluation is to evaluate the similarity between translation hypotheses with different surface forms from the reference, especially at the segment level. We propose to use word embeddings to perform word alignment for segment-level MT evaluation. We performed experiments with three types of alignment methods using word embeddings. W...
متن کاملGeographical Evaluation of Word Embeddings
Word embeddings are commonly compared either with human-annotated word similarities or through improvements in natural language processing tasks. We propose a novel principle which compares the information from word embeddings with reality. We implement this principle by comparing the information in the word embeddings with geographical positions of cities. Our evaluation linearly transforms th...
متن کاملEvaluating word embeddings with fMRI and eye-tracking
The workshop CfP assumes that downstream evaluation of word embeddings is impractical, and that a valid evaluation metric for pairs of word embeddings can be found. I argue below that if so, the only meaningful evaluation procedure is comparison with measures of human word processing in the wild. Such evaluation is non-trivial, but I present a practical procedure here, evaluating word embedding...
متن کامل